Introduction. Work done under this contract may conveniently be divided
into  three  parts. 

I. From time to time we have explored certain problems at the request of
the Director of a classified project in another Naval Research unit. Reports
have been made directly in consultation with this project. Work was
discontinued in each case when the practical application was no longer important.


II. Because of the rapid progress made in the course of this research,
many technical improvements and developments have become necessary, some of
which should be regarded as important results. These are grouped together and
discussed below under Section II.




III. Experimental results in the designated fields of research are
described below in Section III. These may be listed under two principal headings.

1. We have explored several problems in the chaining of responses,
particularly in connection with the effect of a delay before reinforcement.
Some of our results are in press in a paper by Dr. Ferster, "Sustained Behavior
under Delayed Reinforcement," to be published in the Journal of Experimental
Psychology.

2. The major part of our work has been concerned with the effects
of different schedules of reinforcement and different contingencies of
reinforcement upon probability or frequency of response. Some of these results have
been reported in a lecture given by Dr. Skinner in Stockholm in July, 1951.
This paper is to be reprinted in The American Psychologist. A much more
exhaustive report will be published in book form by the Macmillan Company under
the joint authorship of Drs. Skinner and Ferster, the book probably to be
called "Schedules of Reward." It will include not only the major part of our
results under the present contract, but also some of the results to be obtained
under our new Contract H5ori-O7o56.


I.  Points  Checked  for  Practical  Application 

Visual acuity in close-up vision. The classified project with which we
have been associated was at one time interested in the visual acuity of the
pigeon in close-up vision. At the request of the Director, we studied this problem
with two kinds of visual material. In the first experiment, photographic
negatives supplied by a Naval Research Office were used to establish a
discrimination in the pigeon between blocks of three bars presented vertically
or horizontally, the blocks being square and of variable size, with
outside dimensions proportional to the thickness of line. With this material we
established that pigeons can easily form discriminations when the side of the
enclosing square is no greater than 1/32 of an inch and the visual material is
approximately 1 1/4 inches from the eye of the pigeon.

For greater reliability we turned to the use of Ronchi rulings. Again a
discrimination was established between vertical and horizontal positions. We were
able to demonstrate the ability of the pigeon to discriminate between these
positions with rulings at more than 100 per inch. At this point we
encountered technical problems concerned with diffraction and with a source of light
which we were not equipped to solve, and since the practical question had been
answered to the satisfaction of the other project, this research was dropped.

Reaction time. Another practical problem was the minimal reaction time of
the pigeon to visual stimulation with and without a ready signal. We found
that it was possible to establish reaction-time behavior in the pigeon
comparable with that of the human subject, and the reaction times appeared to be
of the same order of magnitude. This work was also abandoned because our
results indicated a satisfactory condition with respect to the practical
problems of the other project.


II.  Technical  Developments 

Introduction. One of the results of our general research in this area
was a demonstration that such a subject as a pigeon could respond at rates as
high as 20,000 responses per hour over long periods of time. We found it
possible to extend our experimental period to as much as twelve or fifteen hours
a day. These results presented severe technical problems in instrumentation,
and not the least contribution under this project has been the development
of apparatus suitable for such research. Some of the more important
developments are as follows:

Keys. The response we have studied is the behavior of the pigeon in
pecking at a small circular key 1 inch in diameter presented behind an open window.
The operation of this key presents several problems. Ordinarily the pigeon
will develop only as much energy as is needed to move the key and usually
reaches a marginal energy where failure of many types of keys raises a serious
problem. We have developed several kinds of mechanical and electronic keys
having high natural frequencies and great sensitivity. We have recently
developed a key in which the contact system is closed to prevent fouling from the
dust which is characteristically given off by the pigeon under experimental
conditions. Several of our models have natural frequencies above any frequency
within the capacity of the pigeon (15 responses per second) and will operate
for several million operations without attention.

Relay equipment. A special problem in research of this sort is to
establish different controlling systems with speed and flexibility. We have worked
out a technique of mounting relays and other pieces of equipment on panels which
can be quickly assembled with snap leads.

Recorders. We have characteristically used a cumulative recorder.
Necessary features include (1) the capacity to follow a high rate of responding,
(2) continuous duty over periods of many days, and (3) successful operation
without attention for many hundreds of millions of responses. A recorder of
this sort must reset automatically when the pen has moved from one edge of the
paper to the other and must report other operations in the form of small pips
on the cumulative curve.

We have developed three successive models of recorders, the last of which
solves these problems reasonably well.
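
The behavior required of such a recorder can be sketched in Python. This is only an illustration under stated assumptions, not a description of our apparatus: the function names and the paper height of 500 steps are hypothetical, and the pips that mark other operations are omitted.

```python
def cumulative_record(response_times, paper_height=500):
    """Trace a cumulative record as (time, pen_position) points.

    The pen steps up one unit per response. When it reaches the far edge
    of the paper (paper_height, a hypothetical value) it resets to zero.
    """
    points = []
    pen = 0
    for t in sorted(response_times):
        pen += 1
        points.append((t, pen))
        if pen >= paper_height:
            pen = 0  # automatic reset to the starting edge
    return points


def mean_rate(points, start, end):
    """Responses per unit time in [start, end): the slope of the record."""
    count = sum(1 for t, _ in points if start <= t < end)
    return count / (end - start)
```

For scale, a rate of 20,000 responses per hour corresponds to roughly 5.6 responses per second, so a record of this kind accumulates many resets in a single session.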


Equipment for concept formation. We have developed a device which will
present visual material for experiments in concept formation using an automatic
slide projector, each slide carrying not only the material but a coding of
spots of light which program an experiment. This apparatus has not yet been
extensively used because for strategic reasons we have been using the equipment
for somewhat more basic problems.

Schedule programmer. Some of the more complicated schedules of
reinforcement which we have studied have required improvements in our programmers. We
have developed a programmer for variable-interval and variable-ratio
reinforcement which uses a teletype transmitter in connection with a relay tree. This
permits the easy handling and construction of various schedules of reinforcement
on both an interval and ratio basis.
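
The kind of sequence such a programmer steps through can be illustrated with a short Python sketch. The programmer itself is a teletype transmitter and relay tree; the code below only shows one possible way of constructing a series of intervals or ratios with a chosen mean. The evenly spaced series, the function names, and the seed are assumptions made for illustration.

```python
import random


def variable_intervals(mean_interval, n, seed=0):
    """Return n intervals, evenly spread around mean_interval, shuffled.

    The values are mean * (2i + 1) / n for i = 0 .. n-1, whose average
    is exactly mean_interval (up to floating-point rounding).
    """
    values = [mean_interval * (2 * i + 1) / n for i in range(n)]
    random.Random(seed).shuffle(values)
    return values


def variable_ratios(mean_ratio, seed=0):
    """Return the response counts 1 .. 2*mean_ratio - 1 in shuffled order.

    The average of these counts is exactly mean_ratio.
    """
    values = list(range(1, 2 * mean_ratio))
    random.Random(seed).shuffle(values)
    return values
```

A series built this way could be repeated end to end for as long as the session runs; other distributions of intervals or ratios are equally possible.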

Magazines. With the refinements in our experimental data it has become
important to present food as a reinforcement for controlled periods of time. Some
types of magazines permit the bird to gather up extra grains after the period
of reinforcement proper has come to an end. We have developed two types of
magazines which present a hopper of grain to a pigeon for a controlled period of
time, then remove the grain at the end of that time so that the pigeon stops
eating immediately and returns to the behavior being studied.


III.  Experimental  Results 


Introduction. Since the experiments to be described differ considerably
from those encountered in general in the field of animal behavior, a few
introductory comments may be in order. We are concerned here with the behavior of
an individual organism. Rarely do we concern ourselves with averaged curves
or data of any sort. We are also concerned with the continuous record of the
performance of the organism in which the probability or frequency of response
at any given moment may be determined. These periods of experimentation extend
up to fifteen hours per day on a daily basis. The datum with which we are
primarily concerned is the momentary rate of responding or changes in that rate as
a function of different variables. The cumulative records we obtain are read
with respect to momentary slope, curvature, and the fine "grain" of the record
as an indication of the degree of uniformity or regularity of the behavior.

In general we have studied the "steady states" achieved under various schedules
of reinforcement, as well as the stages leading up to these steady states and
the extinction of the behavior at any such stage. In general we have studied
small numbers of pigeons, one or two ordinarily being used to study the effect
of a given set of conditions. However, since our experiments overlap and
interlock considerably, we usually acquire in the end a relatively large number of
cases to establish any given point. Data at the present time, however, consist
mainly of illustrative records. These are not selected as the best records to
be obtained under a given set of conditions, but are offered as representative
and typical samples in cases where repeated experimentation has satisfied
us of the reproducibility of the results in every detail.


Results: Chaining of Responses and Delayed Reinforcement

Techniques were developed so that problems in chaining of responses could
be studied by reinforcing a single response with the appearance of a stimulus
which is the occasion upon which the same response then produces food. Through
the use of this technique we found that the effect of a delay in reinforcement
on the probability of a response depends upon the formation of another link in
a chain of responses. Some incidental response occurring during the delay
period is accidentally reinforced. Under certain schedules it acquires a high
probability. In reality, then, an apparent delay in reinforcement is simply
the reinforcement of a second response in a chain. The factors governing the
effect of a particular delay and the effect of a change from one delay to another
were the same as those determining the formation of a chain of responses.


Results:  Schedules  of  Intermittent  Reinforcement 

The types of schedules we have studied may be summarized as follows:

The response to be reinforced can be determined by an external clock
(interval reinforcement) or by the previous responding of the organism itself (ratio
reinforcement). We have studied the following variations:

Fixed. Reinforcement may occur at the end of fixed intervals or after
fixed ratios of unreinforced to reinforced responses.

Variable. Reinforcement may occur at the end of a variable interval or
after a variable number of responses (or ratio) since the preceding reinforcement.

Two-Valued. A case of some theoretical interest is reinforcement after an
interval which assumes either one of two values on an unpredictable schedule
(two-valued interval) or after a number of responses of either one of two
values on an unpredictable schedule (two-valued ratio).
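
The three variations just described differ only in the series of values which the interval or ratio assumes, and this can be made explicit in a short Python sketch. It is an illustration under stated assumptions, not a description of our programming equipment: responses are represented by their times of occurrence, the session is taken to start at time zero, and all function names are hypothetical.

```python
import itertools
import random


def reinforce_by_interval(response_times, intervals):
    """Reinforce the first response made once the current interval has elapsed.

    `intervals` must be an endless iterable (e.g. itertools.repeat or cycle).
    """
    intervals = iter(intervals)
    reinforced = []
    last = 0.0                      # time of the preceding reinforcement
    required = next(intervals)
    for t in response_times:
        if t - last >= required:
            reinforced.append(t)
            last = t
            required = next(intervals)
    return reinforced


def reinforce_by_ratio(response_times, ratios):
    """Reinforce the response that completes the current required count."""
    ratios = iter(ratios)
    reinforced = []
    count = 0                       # responses since the preceding reinforcement
    required = next(ratios)
    for t in response_times:
        count += 1
        if count >= required:
            reinforced.append(t)
            count = 0
            required = next(ratios)
    return reinforced


def two_valued(a, b, seed=0):
    """Endless, unpredictable series taking one of two values."""
    rng = random.Random(seed)
    while True:
        yield rng.choice((a, b))
```

A fixed schedule is obtained with `itertools.repeat(value)`, a variable schedule with `itertools.cycle(series)`, and a two-valued schedule with `two_valued(a, b)`, each passed to either function.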

Tandem. The mechanisms operating under intermittent reinforcement are
somewhat clarified by certain tandem schedules.

1. The principal schedule is fixed interval, but at the end of each
interval the schedule changes to a small ratio, e.g., at the end of ten minutes
the organism must emit ten responses and will be reinforced on the last of these.

2. The principal schedule is fixed ratio, but at the end of each
ratio the organism is reinforced after the lapse of a small interval of time,
e.g., the pigeon responds a hundred times and is then reinforced for the first
response after ten seconds have elapsed.

3. Other interesting tandem schedules are variable interval leading
to a small ratio and variable ratio leading to a small interval.
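
The first two tandem cases can be sketched in Python as follows. This is a minimal illustration, assuming that responses are given as a list of times measured from the start of the session; the function names are hypothetical and nothing here describes the relay circuits actually used.

```python
def tandem_interval_ratio(response_times, interval, ratio):
    """Fixed interval leading to a small ratio (case 1).

    Once `interval` has elapsed since the last reinforcement, responses
    are counted, and the `ratio`-th of them is reinforced.
    """
    reinforced = []
    last = 0.0
    count = 0
    for t in response_times:
        if t - last >= interval:
            count += 1
            if count == ratio:
                reinforced.append(t)
                last = t
                count = 0
    return reinforced


def tandem_ratio_interval(response_times, ratio, interval):
    """Fixed ratio leading to a small interval (case 2).

    After `ratio` responses, the first response made at least `interval`
    later is reinforced.
    """
    reinforced = []
    count = 0
    ratio_done_at = None            # time at which the ratio was completed
    for t in response_times:
        if ratio_done_at is None:
            count += 1
            if count == ratio:
                ratio_done_at = t
        elif t - ratio_done_at >= interval:
            reinforced.append(t)
            count = 0
            ratio_done_at = None
    return reinforced
```

With times in seconds, the examples in the text correspond to `tandem_interval_ratio(times, 600, 10)` and `tandem_ratio_interval(times, 100, 10)`. The variable cases under 3 would replace the fixed values with a varying series.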

Mixed. Some important implications of intermittent reinforcement follow
from certain schedules in which the organism is reinforced on different schedules
for substantial periods of time, e.g., on a fixed-interval schedule for one hour,
then a variable-interval schedule for one hour, and so on. This permits us
to observe the transition from one schedule to another.

Interpolated. In a variation on a mixed schedule a short period of
reinforcement on one schedule is interpolated into a background schedule -- e.g., a
short run of fixed-ratio reinforcements is introduced into a background schedule
of fixed interval.


Interlocking. A schedule of reinforcement which has many analogues in the
field of social behavior is one in which the organism is reinforced on a
combination of interval and ratio schedules. On one such interlocking schedule, for
example, the organism must emit a relatively large number of responses if it is
responding rapidly, but will be reinforced for a smaller number if it responds
more slowly. In this system the reinforcing mechanism is affected by the
performance of the organism in a unique way.
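
One way in which such an interlocking schedule might be arranged is sketched below in Python. The linear decline of the required number of responses with time since the last reinforcement is an assumption made for illustration, as are the function and parameter names; the text does not specify the exact relation.

```python
import math


def interlocking(response_times, max_ratio, max_interval):
    """Required count falls linearly from max_ratio to 1 over max_interval.

    A response is reinforced when the count since the last reinforcement
    reaches the number required at that moment (assumed linear rule).
    """
    reinforced = []
    last = 0.0
    count = 0
    for t in response_times:
        count += 1
        elapsed = t - last
        required = max(1, math.ceil(max_ratio * (1 - elapsed / max_interval)))
        if count >= required:
            reinforced.append(t)
            last = t
            count = 0
    return reinforced
```

Under this rule, with `max_ratio=100` and `max_interval=60` seconds, a bird responding ten times per second is first reinforced on its 86th response, while a bird responding once every ten seconds is first reinforced on its 6th.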

Adjusting. Another way in which the reinforcing system may be modified by
the behavior of the organism is exemplified by a schedule which changes
progressively in terms of the performance of the organism at earlier stages in the
experiment. Where the interlocking schedule changes during a single interval
between reinforcements, the adjusting schedule changes after a given interval with
respect to the interval which is to follow. In an adjusting ratio schedule the
number of responses which the organism must emit before being reinforced varies
with the rate of responding during earlier stages of the experiment.
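
An adjusting ratio schedule of this kind can be sketched in Python. The particular adjustment rule (raise the ratio by a fixed step when the rate in the interval just completed exceeds a target, otherwise lower it) is an assumption chosen for illustration; the names and default values are hypothetical.

```python
def adjusting_ratio(response_times, start_ratio, target_rate, step=5, min_ratio=1):
    """Ratio schedule whose requirement is revised after each reinforcement.

    Returns the reinforced response times and the ratio in force for each.
    """
    reinforced = []
    ratios_used = []
    ratio = start_ratio
    last = 0.0                      # time of the preceding reinforcement
    count = 0
    for t in response_times:
        count += 1
        if count >= ratio:
            reinforced.append(t)
            ratios_used.append(ratio)
            # Rate over the interval just completed (responses per unit time).
            rate = count / (t - last) if t > last else float("inf")
            if rate > target_rate:
                ratio = ratio + step
            else:
                ratio = max(min_ratio, ratio - step)
            last = t
            count = 0
    return reinforced, ratios_used
```

Note that the ratio changes only at reinforcement, with respect to the interval which follows, whereas in the interlocking schedule the requirement changes within a single interval.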

Multiple. Any of the above schedules may be combined and placed under
stimulus control. For example, the key which the pigeon strikes may be colored
in different ways. When one color prevails, the pigeon is reinforced on one
schedule; when another color prevails, it is reinforced on another. This is a
multiple tandem schedule. In a multiple concurrent schedule the bird has
access to two or more keys. These are necessarily under stimulus control because
of their position, but the stimulus control is also heightened by adding
different colors to the keys. One key is reinforced on one schedule; another key
on another.


Additional  Contingencies  of  Reinforcement 

In  addition  to  schedules  as  such,  it  is  possible  to  reinforce  an  organism 
in  terms  of  its  rate  at  the  moment  of  reinforcement.  For  example,  the  organism 
may  be  reinforced  on  any  one  of  the  above  schedules,  but  only  when  its  rate  is 
momentarily  above  a  given  value  or  momentarily  below  a  given  value. 
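
A rate contingency of this kind, added to a fixed-interval schedule, can be sketched in Python. The momentary rate is taken here to be the reciprocal of the time since the preceding response, which is an assumption, as are the function and parameter names.

```python
def interval_with_rate_gate(response_times, interval, min_rate=None, max_rate=None):
    """Fixed-interval schedule with an added requirement on momentary rate.

    A response is reinforced only if the interval has elapsed AND the
    momentary rate (1 / time since the preceding response) lies within
    the stated limits. Limits left as None are not applied.
    """
    reinforced = []
    last_reinforcement = 0.0
    previous = None
    for t in response_times:
        rate = None
        if previous is not None and t > previous:
            rate = 1.0 / (t - previous)
        previous = t
        if rate is None or t - last_reinforcement < interval:
            continue
        if min_rate is not None and rate < min_rate:
            continue
        if max_rate is not None and rate > max_rate:
            continue
        reinforced.append(t)
        last_reinforcement = t
    return reinforced
```

Setting `min_rate` reinforces only rapid responding, and setting `max_rate` reinforces only slow responding. A scheduled reinforcement which fails the rate test remains available until a later response meets it.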


Added  Stimulus  Feedback 

Under these various schedules and contingencies of reinforcement, the
organism is reinforced in the presence of self-generated stimuli which are
important in the final determination of its behavior. It is possible to test the
importance of such stimuli by adding external stimuli correlated with
reinforcement in the same ways. For this purpose we have used a small spot of light
projected on the key which the pigeon strikes. The spot may be made to vary with
lapsed time since the preceding reinforcement, with the number of responses
made since the last reinforcement, or with the momentary rate of responding.

The  feedback  associated  with  the  number  of  responses  made  in  a  given  period  of 
time  may  be  permitted  to  "fade"  according  to  some  temporal  schedule.  Any  of 
these  types  of  feedback  may  be  added  to  any  of  the  schedules  or  contingencies 
already  listed. 
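
The relation between the spot and the variable it represents can be sketched in Python. The limiting lengths of 1/8 and 3/4 of an inch are those of the clock described later in this report; the linear growth between them and the function names are assumptions made for illustration.

```python
SMALL = 1 / 8   # inches, spot length just after reinforcement
LARGE = 3 / 4   # inches, spot length at which reinforcement is due


def spot_length(progress):
    """Spot length in inches for progress between 0 and 1 (clamped)."""
    progress = min(1.0, max(0.0, progress))
    return SMALL + (LARGE - SMALL) * progress


def spot_from_time(elapsed, interval):
    """Spot as a 'clock': varies with time since the preceding reinforcement."""
    return spot_length(elapsed / interval)


def spot_from_count(count, ratio):
    """Spot as a 'counter': varies with responses since the last reinforcement."""
    return spot_length(count / ratio)
```

A spot tied to the momentary rate, or one allowed to fade, would use the same mapping with a different measure of progress.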

Probes


All of the preceding schedules may be established in advance of a given
experiment and the behavior of the pigeon may be studied either in successive
stages or in a final "steady state." Some information about relevant variables
may be obtained by introducing single events into such a program. For example,
a single response may be reinforced at any point in a standard record to
determine the effect of a single reinforcement as such. We have also probed behavior
with a "blackout." The lights in the apparatus are turned out so that the bird
does not (typically) respond. During this period of "dead time" the stimuli
automatically generated by behavior are presumably permitted to grow weak.


RESULTS 

We have studied the behavior of small groups of animals under all of the
schedules and contingencies listed above. In general the behavior of the pigeon
appears to be under the control of the stimulating conditions which prevail at
the moment of payoff. These are the actual effect upon the organism of any
given schedule of reinforcement or of any special contingency or of
supplementary stimulation provided as a feedback. It is anticipated that the present
research together with further research which is continuing under another
contract will provide a fairly simple picture of the relevant conditions. Such a
formulation would permit us not only to represent the effect of all these
schedules but to predict the effect of any new schedule in advance. At the
present time our results consist of a very large number of records obtained
under these schedules. These are quite consistent, orderly, and highly
reproducible from one organism to another. Some typical results were described in the
lecture already referred to and the following excerpt from that report will
show the general nature of this work.

Let us begin with the case in which reinforcements are arranged by a
clock. We may represent such a schedule by drawing vertical lines on our
cumulative graph. In Fig. 1 the lines are 5 minutes apart. We reinforce
a response as soon as the pen reaches the first line, regardless of how
many responses have been made. We reinforce again when it reaches the
second line and so on. In other words, we simply reinforce responses at
intervals of approximately 5 minutes. Call this "fixed-interval
reinforcement." The organism quickly adjusts with a fairly constant rate of
responding, which produces a straight line with our method of recording. The
rate -- the slope of the line -- is a function of several things. It varies
with difficulty of execution: the more difficult the response, the lower
the slope. It varies with degree of food deprivation: the hungrier the
organism, the higher the slope. And so on. It will be seen, moreover, that
such a record is not quite straight. After each reinforcement the pigeon
pauses briefly -- in this case for 30 or 40 seconds. This is due to the
fact that under a fixed-interval schedule no response is ever reinforced
just after reinforcement. The organism is able to form a discrimination
based upon the stimuli generated in the act of eating food. (The process
of forming such a discrimination has been thoroughly investigated with
stimuli which may be better controlled.) So long as this stimulation is
effective, the rate is low. Thereafter the organism responds at
essentially a constant rate. It would appear that stimuli due to the mere
passage of time are not significantly different to the organism during the
remaining part of the interval. The organism cannot, so to speak, tell the
difference between, say, 3 and 4 minutes after reinforcement under these
circumstances.

In this record the rate of responding assumes two values -- it is
zero so long as stimulation from the preceding reinforcement is effective;
otherwise it is fairly high and constant, so long as the several factors
just mentioned are not changed. Now, the pigeon is presumably stimulated
by its own behavior. It must possess, so to speak, a crude "speedometer"
which tells it how fast it is responding. Under fixed-interval
reinforcement there are usually only two readings -- "zero" and "fairly fast." The
pigeon is practically always reinforced at the latter reading. It is
practically never reinforced for the first response when the reading is
zero. The pigeon therefore is able to, and does, form a discrimination.

We may put this in the form of a rule: when responding rapidly, continue
to do so because your chances are good; when not responding, there is little
or no reason to begin. We can see this rule in action if we withhold all
further reinforcement. This brings about the process called extinction, in
which the rate passes from a high initial value to practically zero. But
extinction after prolonged fixed-interval reinforcement is not a simple
process. Fig. 2 is a typical example. The pigeon begins at a high rate.
This is a speedometer reading at which reinforcement has frequently been
received. The chances of reinforcement are good and responding continues
for several minutes. About 1000 responses are emitted at this rate.
Eventually some sort of exhaustion sets in and the rate falls off, rather
quickly, but in an orderly fashion, eventually to zero. This is a reading
of the speedometer at which reinforcements have never been received;
consequently, the pigeon does not resume responding for a long time. When
another response eventually occurs -- for reasons which we cannot specify --
it restores to some extent the condition under which reinforcement has been
received. Another response soon occurs, and this improves still further
the speedometer reading. A high rate is quickly reached. Since this is
again the condition under which reinforcements have frequently been
received, responding is maintained for some time. Another group of
approximately 800 responses appears. The rate then falls smoothly again to zero.
Curves of this sort are satisfactorily accounted for by assuming, first,
that when the organism is responding rapidly, it creates an optimal
stimulating condition and, secondly, that when it is not responding, the
stimulating condition is minimal.

A pigeon will continue to respond indefinitely when reinforcements are
spaced as much as 45 minutes apart, even though food is received too slowly
to maintain body weight, so that extra feeding is necessary between
experimental periods. The behavior after each reinforcement shows a much
slower acceleration from a low to a high rate. In extinction, the effect
of self-generated stimuli is seen. Fig. 3 is an example, broken into two
segments to show details more clearly. The pigeon begins as usual at a low
rate of responding at A. It has never been reinforced at the start of the
experiment or immediately after another reinforcement. A higher rate
develops smoothly during the first 20 or 30 minutes. This part of the curve
is a fair sample of the behavior after each reinforcement on a 45-minute
schedule. Eventually a rate is reached at which reinforcements have been
most often received. (This is by no means the highest rate of which the
pigeon is capable.) Because this is an optimal condition, the rate
prevails for some time. When the pigeon pauses for a few moments, however
(at B), it creates a condition which is not optimal for reinforcement.
Responding is therefore not resumed for some time. Eventually another slow
acceleration leads to the same high rate. When this is again broken (at C),
another period of slow responding intervenes, followed by another
acceleration. Eventually the rate falls off in extinction. Although such a curve
is complex, it is not disorderly. It is by no means random responding.

Since no external condition changes during the experimental period, the
change in rate must be due to conditions altered by the bird's own behavior.
The curve can be explained in terms of the self-stimulating effect of
responding at special rates.

We can test the importance of the passage of time in accounting for
behavior of this sort by giving the pigeon an external "clock." One such
"clock" consists of a spot of light projected upon the key which the pigeon
pecks. The spot marks time by changing size. At first it is only 1/8th of an
inch in length. It grows to 3/4th of an inch at a given rate. The response
to the key is reinforced when the spot is largest. When the pigeon returns
to the key after reinforcement, the spot has again become small. Here is
an external stimulus, then, roughly proportional to the time which has
passed since the last reinforcement. Can it be used by the pigeon as a
discriminative stimulus?

To avoid a disturbing complication we must get the spot of light into
the experiment before it functions as a clock. Suppose we begin by holding
the spot still at its largest size, and build up the usual fixed-interval
performance. In Fig. 4 the upper curve shows a standard sample. The spot
was set at "large" and the record is typical of reinforcement at intervals
of 10 minutes. We now -- for the first time -- change the size of the spot,
letting it begin at "small" to grow progressively larger during the
interval. The spots in the circles above the lower record in Fig. 4 give
sample readings of the clock at various positions. We observe that the
pigeon is sensitively controlled by the size. When the spot is small it
is most unlike its accustomed size, and the rate is almost zero. As the
spot grows, the similarity increases and the rate rises. As the spot
reaches its final standard size, the rate has reached or exceeded the value
at which responses have been reinforced. Such a curve is not the effect of
the passage of time; it is a report of stimulus generalization from large
spots to smaller ones.

Eventually, however, the correlation between the size of the spot and
the passage of time is felt. The pigeon begins, so to speak, to "tell
time." In Fig. 5 a series of records shows the progress of a pigeon in
learning to use the clock projected upon the key. In each sample, three
intervals are shown. In Record 1 the curvature is already somewhat
sharper than in the preceding figure. As the pigeon is repeatedly exposed
to the changing spot and is reinforced only when the spot is large, these
gradients become sharper still. By the time Record 5 is reached, the pigeon
is not responding for approximately the first 7 or 8 minutes out of each 10.
By that time the spot has reached a size very close to optimal and
responding then begins and soon reaches a very high rate.

Eventually the pigeon characteristically waits fully 8 out of the 10
minutes and responds at a rate of 4 or 5 responses per second during the
remaining part of the interval. It has formed a very precise
size-discrimination. This would be the result without an added clock if the pigeon
had what we call a precise "sense of time." But it is obvious that the
unamplified passage of time is very insignificant for the pigeon compared
with a physical clock of this sort.

The  extent  of  the  control  exorcised  by  the  size  of  the  spot  is  beau¬ 
tifully  illustrated  if  we  withhold  further  reinforcement  while  allowing 
the  clock  to  run,  repeating  cycle  after  cycle  of  the  growth  of  the  spot 
from  small  to  largo  as  in  Fig.  6.  The  pigeon  continues  not  to  respond 
during  all  sizes  of  the  spot  except  those  close  to  the  value  which  has 
previously  obtained  at  reinforcement.  As  repeated  responses  go  unrein¬ 
forced,  however,  the  amount  of  responding  to  the  high  value  progressively 
decreases.  The  extent  of  the  control  exercised  by  the  spot  can  be  shown 
in  many  other  ways.  We  discovered  one  of  these  by  accident.  Our  experi¬ 
ments  are  fully  automatic  and  our  apparatus  is  used  24  hours  of  the  day. 
When  we  reached  the  laboratory  one  morning,  we  found  that  a  pigeon  had  not
responded  all  night  long.  Investigation  showed  that  through  an  oversight
the  clock  had  not  been  started.  The  spot  had  remained  at  its  smallest
size  for  15  hours.  During  this  time  the  pigeon  had  not  made  a  single
response  to  the  key.  Another  pigeon  tested  with  the  clock  stationary  at
"small"  waited  5  hours  before  responding.  It  then  responded  once  and  was
reinforced.  What  happened  is  shown  in  Fig.  7.  The  first  single  reinforcement
raised  the  rate  of  responding  from  practically  zero  to  a  definite  and
fairly  stable  value  shown  at  B.  The  clock  remained  "small,"  of  course.

After  10  minutes  the  pigeon  was  reinforced  again,  whereupon  the  rate  rose
practically  to  its  normal  value  without  benefit  of  clock  (C).  The  record
is  a  good  example  of  the  effectiveness  of  single  reinforcements.  Only  two
reinforcements  were  necessary  to  restore  the  normal  rate,  and  each  did
about  half  the  job. 

At  the  other  extreme,  we  can  show  the  enormous  power  of  the  clock  stopped
at  its  largest  size.  Fig.  8  is  a  typical  case.  The  record  begins  with
three  intervals  of  reinforcement  during  which  the  clock  was  running  as  before.
The  clock  then  remained  at  "large."  During  the  next  10  minutes  the
bird  responded  nearly  2000  times.  It  was  then  reinforced,  the  spot  continuing
at  large.  The  rate  eventually  fell  off  and,  though  not  shown  in
the  figure,  it  eventually  reached  the  normal  value  under  10-minute  reinforcement
without  benefit  of  clock.

When  time  has  been,  so  to  speak,
externalized  in  this  way,  it  may  be  manipulated.  For  example,  our  clock
may  be  made  to  run  fast  or  slow.  In  one  experiment,  various  speeds  of
"time"  were  introduced  at  random  in  successive  intervals.  The  clock  might
complete  one  cycle  in,  say,  3  minutes,  at  the  end  of  which  time  a  response
would  be  reinforced,  whereupon  the  next  cycle  might  require  16  minutes,  and
so  on.  The  extent  of  the  control  exercised  over  the  bird's  behavior  is 
seen  in  Fig.  9  where  typical  performances  for  a  range  of  clock  speeds
between  1  cycle  in  3  minutes  and  1  cycle  in  32  minutes  are  shown.  The 
rate  of  responding  is  roughly  the  same  for  a  given  size  of  spot  regardless 
of  speed  of  change.  The  curve  at  32  minutes  is  obviously  not  approximately 
10  times  as  high  as  that  in  3  minutes,  however,  as  it  should  be  if  the  con¬ 
trol  by  the  spot  were  strictly  equivalent  in  both  cases. 
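
The factor of roughly ten can be checked with a short calculation. As a sketch only, suppose (in our notation, which is not the report's) that the rate of responding r depends solely on the relative size of the spot s = t/T, where T is the length of the clock cycle. The number of responses N in one cycle would then be proportional to T:

```latex
N(T) = \int_0^T r\!\left(\tfrac{t}{T}\right) dt = T \int_0^1 r(s)\, ds,
\qquad
\frac{N(32\ \text{min})}{N(3\ \text{min})} = \frac{32}{3} \approx 10.7
```

Under this assumption the 32-minute curve would stand about ten times as high as the 3-minute curve; the shortfall seen in the records is the departure from strict equivalence noted above.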

It  is  also  possible  to  run  externalized  time  backward.  Our  first 
experiment  of  this  sort  was  again  an  accident.  The  bird  was  being  studied 
with  a  3-minute  clock  and  was  responding  as  shown  at  the  left  in  Fig.  10. 

The  next  day,  through  an  oversight,  the  clock  was  run  backward.  It  began 
large  and  grew  small.  The  first  3  segments  of  the  second  curve  in  Fig.  10 
are  essentially  inversions  of  the  segments  of  the  other  curve.  Since  the
bird  was  now  reinforced  when  the  spot  was  small,  however,  a  new  pattern 
quickly  arose.  The  curve  becomes  essentially  linear  and  at  a  later  stage, 
not  shown  in  the  graph,  the  usual  performance  with  a  clock  develops. 

Whether  the  spot  of  light  is  to  grow  or  shrink  with  time  is,  of  course, 
arbitrary,  and  the  bird  will  adjust  to  either  case. 

We  may  eliminate  the  effect  of  time  by  adopting  a  different  schedule. 
Reinforcements  are  still  controlled  by  a  clock,  but  the  intervals  between 
them  are  varied,  roughly  at  random,  within  certain  limits  and  with  a  given 
mean.  In  such  a  case  the  bird  cannot  predict,  so  to  speak,  when  the  next 
reinforcement  is  to  be  received.  This  is  called  variable-interval  reinforcement.
The  effect  is  a  uniform  rate  of  responding  with  great  stability,
which  may  be  maintained  for  many  hours.  Fig.  11  shows  a  record  in  which 
the  actual  intervals  ranged  between  a  few  seconds  and  6  minutes  in  length. 

The  randomized  reinforcements  are  marked  by  small  cross-dashes.  Delays 
following  reinforcement  are  lacking.  Although  the  rate  slightly  changes 
from  time  to  time,  there  is  no  pause  as  long  as  10  seconds.  The  record, 
which  is  typical,  covers  a  period  of  more  than  2  hours.  The  control  ex¬ 
ercised  by  a  schedule  of  this  sort  may  be  very  great.  During  a  single  ex¬ 
perimental  record  of  15  hours  a  bird  responded  30,000  times.  During  this 
period  the  bird  received  less  than  its  daily  ration  of  food.  Toward  the 
end  of  the  record  there  was  one  pause  approximately  1  minute  long,  but  other¬ 
wise,  the  bird  did  not  pause  longer  than  15  seconds  at  any  time  during 
the  15  hours. 
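
The variable-interval contingency described above can be sketched in a few lines of Python. This is an illustration, not a description of our apparatus: the class name, the uniform sampling of intervals, and the timing of each interval from the previous reinforcement are all assumptions. The limits of 5 and 360 seconds merely echo the "few seconds to 6 minutes" of Fig. 11.

```python
import random


class VariableInterval:
    """Reinforce the first response made after a randomly chosen interval."""

    def __init__(self, low_s, high_s, seed=None):
        self.low_s = low_s            # shortest interval, in seconds
        self.high_s = high_s          # longest interval, in seconds
        self.rng = random.Random(seed)
        self.last_reinforced_at = 0.0
        self.current_interval = self._draw()

    def _draw(self):
        # Uniform sampling is an assumption; the report says only
        # "roughly at random, within certain limits".
        return self.rng.uniform(self.low_s, self.high_s)

    def respond(self, now_s):
        """Return True if a response at time now_s (seconds) is reinforced."""
        if now_s - self.last_reinforced_at >= self.current_interval:
            self.last_reinforced_at = now_s
            self.current_interval = self._draw()
            return True
        return False
```

Because the interval is drawn afresh after every reinforcement, nothing available at the moment of responding indicates when the next reinforcement is due.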

We  turn  now  to  an  entirely  different  type  of  schedule.  The  moment 
at  which  a  response  is  to  be  reinforced  may  be  determined  by  the  behavior 
of  the  organism  itself.  For  example,  we  may  reinforce  every  fifth  response, 
every  fiftieth  response,  or  every  two  hundredth  response.  We  call  this 
fixed-ratio  reinforcement,  meaning  that  the  ratio  of  unreinforced  to  reinforced
responses  remains  fixed.  In  industry,  this  is  called  a  piece-work
basis  of  pay;  the  worker  is  paid  in  terms  of  his  productivity.  The  pigeon's 
behavior  under  such  a  schedule  is  not  too  difficult  to  interpret.  Fig.  12 
shows  a  short  segment  of  a  characteristic  performance.  A  response  is  re¬ 
inforced  every  time  the  pigeon  completes  a  group  of  200  responses.  Where 
we  represented  a  fixed-interval  reinforcement  by  drawing  vertical  lines  on
our  cumulative  graph,  here  we  represent  fixed-ratio  reinforcement  with  a
series  of  horizontal  lines.  Whenever  the  curve  reaches  one  of  these  lines
the  response  is  reinforced,  no  matter  how  much  time  has  elapsed.

The  result  is  typically  a  series  of  gradients.  Immediately  after  reinforcement
a  low  rate  of  responding  prevails;  just  before  reinforcement,
a  high  rate.  The  transition  from  one  to  the  other  is  characteristically
slower  than  under  fixed-interval  reinforcement.  This  difference  appears
to  be  due  to  another  source  of  stimulation  available  under  fixed-ratio  re¬ 
inforcement.  In  addition  to  a  clock  and  a  speedometer,  the  pigeon  pre¬ 
sumably  has  a  "counter"  which  tells  it  how  many  responses  it  has  made 
since  the  previous  reinforcement. 

An  increase  in  its  counter  reading  may  be  immediately  reinforcing 
to  the  pigeon.  One  way  to  test  this  is  to  add  an  external  counter  compar¬ 
able  to  the  external  clock.  The  spot  of  light  on  the  key  is  made  to  grow, 
not  with  the  passage  of  time,  but  with  the  accumulation  of  responses.  If 
the  pigeon  does  not  respond,  the  spot  remains  stationary.  With  each  response
it  grows  a  small  amount.  The  effect  of  this  externalized  counter  is
dramatic.  In  one  experiment  the  pigeon  was  being  reinforced  approximately 
every  70  responses.  It  was  proceeding  at  an  overall  speed  of  about  6,000 
responses  per  hour.  As  soon  as  a  spot  of  light  was  added  to  the  key,  in 
such  a  way  that  it  grew  from  "small"  to  "large"  as  the  effect  of  70  res¬ 
ponses,  the  rate  went  up  almost  immediately  to  20,000  responses  per  hour  as 
in  Fig.  13.  Performance  under  fixed-ratio  reinforcement  in  the  absence 
of  an  external  counter  is  presumably  of  the  same  sort,  except  that  the 
pigeon's  own  counter  is  much  less  effective  than  the  spot  of  light.  It 
is  possible  to  carry  a  pigeon  to  a  high  ratio  without  introducing  appre¬ 
ciable  pauses  after  reinforcement,  but  this  process  is  slow  and  must  be 
carried  out  with  great  care  as  the  pigeon  is  made  sensitive  to  its  own 
counter. 
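
The fixed-ratio contingency and the external counter may be sketched together as follows. The class and method names are ours, the spot sizes 0.0 and 1.0 are hypothetical units, and linear growth of the spot with each response is an assumption; the report states only that the spot grows "a small amount" with each response.

```python
class FixedRatioWithCounter:
    """Fixed-ratio schedule with an external 'counter' shown as spot size."""

    def __init__(self, ratio, small=0.0, large=1.0):
        self.ratio = ratio    # responses required per reinforcement
        self.small = small    # hypothetical spot size just after reinforcement
        self.large = large    # hypothetical spot size at the full ratio
        self.count = 0        # responses since the last reinforcement

    def spot_size(self):
        """Spot size depends on accumulated responses, not on elapsed time."""
        fraction = self.count / self.ratio
        return self.small + fraction * (self.large - self.small)

    def respond(self):
        """Register one response; return True if it is reinforced."""
        self.count += 1
        if self.count >= self.ratio:
            self.count = 0    # counter resets, so the spot returns to small
            return True
        return False
```

With `FixedRatioWithCounter(70)` every 70th response is reinforced, and the spot stays where it is whenever the bird stops responding.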

We  can  prove  that  the  pigeon  is,  so  to  speak,  counting  its  responses 
by  setting  up  a  two-valued  schedule  of  reinforcement.  We  reinforce  the 
50th  response  after  the  preceding  reinforcement  or  the  250th,  and  we  ar¬ 
range  our  program  in  such  a  way  that  there  is  no  indication  in  advance 
which  ratio  is  to  prevail.  In  such  a  case,  the  pigeon  develops  a  step- 
like  curve  appropriate  to  a  ratio  of  50  to  1.  But  it  shows  this,  of  course, 
even  when  the  ratio  is  actually  250  to  1.  In  Fig.  14,  for  example,  the
segments  at  A,  B  and  C  show  either  two  or  three  waves  which  are  the  grad¬ 
ients  prevailing  under  a  reinforcement  of  50  to  1.  That  is  to  say,  the 
pigeon  begins  as  if  the  ratio  were  to  be  50  to  1.  But  after  60  or  75 
responses  have  been  completed  there  is  a  marked  decrease  in  rate  which  can 
only  be  explained  by  assuming  that  the  bird,  so  to  speak,  knows  the  score. 

A  short  period  of  slow  responding  follows.  This  gives  way  to  a  second 
gradient,  again  roughly  of  the  order  prevailing  under  50  to  1  reinforcement. 
This  may  even  be  followed  by  a  third  gradient  before  reinforcement  is  received
at  the  250th  response.  If,  as  at  D,  we  simply  withhold  all  reinforcements,
an  extinction  curve  emerges  in  the  form  of  a  series  of  waves,  averaging
approximately  50  responses  each.  This  cannot  be  due  to  the  mere  passage
of  time,  since  time  does  not  show  a  wave-like  character.  It  cannot  be  due 
to  a  discrimination  based  upon  the  rate  of  responding  because  this  should 
lead  to  long  segments  at  a  high  rate,  as  in  both  fixed-interval  and  fixed-ratio
reinforcement.  We  have,  then,  to  take  into  account  a  third  source
of  automatic  stimulation  at  the  moment  of  reinforcement  provided  by  a
"counter."

We  can  eliminate  the  "counter"  by  randomizing  a  schedule  of  many
different  ratios.  Fig.  15,  for  example,  gives  a  typical  record  obtained
under  what  we  may  call  "variable-ratio  reinforcement."  A  response  was
reinforced  on  the  average  every  110  responses,  but  in  actual  practice  the 
very  next  response  or  a  response  as  many  as  500  responses  later  might  have 
been  reinforced.  The  schedule  produces  a  very  high  rate  of  responding, 
sustained  for  long  periods  of  time,  showing  none  of  the  oscillations  in 
rate  characteristic  of  fixed-ratio  reinforcement.  The  rate  shown  here  is
approximately  12,000  responses  per  hour. 
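
A variable-ratio program of this kind might be sketched as below. The list of ratios is hypothetical, chosen only so that it runs from 1 to 500 with a mean of 110 as in the text; the actual series used is not given here. Drawing each requirement independently and with equal probability is likewise an assumption.

```python
import random

# Hypothetical series of ratios: smallest 1, largest 500, mean 110.
EXAMPLE_RATIOS = [1, 5, 15, 30, 50, 80, 120, 149, 150, 500]


class VariableRatio:
    """Reinforce after a number of responses drawn at random from a list."""

    def __init__(self, ratios, seed=None):
        self.ratios = list(ratios)
        self.rng = random.Random(seed)
        self.count = 0                       # responses since last reinforcement
        self.required = self.rng.choice(self.ratios)

    def respond(self):
        """Register one response; return True if it is reinforced."""
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self.rng.choice(self.ratios)
            return True
        return False
```

The same sketch covers the two-valued schedule discussed earlier if it is constructed as `VariableRatio([50, 250])`.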

This  variable-ratio  schedule  is  familiar  to  everyone,  because  it  is 
the  fundamental  feature  of  all  gambling  devices.  The  pigeon  responsible 
for  Fig.  15  is  not  far  removed  from  the  pathological  gambler.  Variable-ratio
reinforcement  engages  and  holds  the  behavior  of  the  organism  with
particular  power.  The  magnitude  of  its  control  is  seen  when  we  extinguish
the  response.  Fig.  16  is  an  extinction  curve  obtained  after  the  variable-ratio
reinforcement  shown  in  the  preceding  figure.  The  curve  has  been
broken  into  consecutive  segments  in  order  to  avoid  undue  reduction.  The
curve  begins  with  a  long  run  of  approximately  7,500  responses  during  which
there  is  no  appreciable  retardation.  The  remainder  of  the  curve  is  also 
illuminating.  After  short  periods  of  slow  responding  the  pigeon  returns 
again  and  again  to  the  original  rate,  which,  as  the  prevailing  condition
at  previous  reinforcements,  tends  to  perpetuate  itself.  But  the  effects
of  variable-interval  and  variable-ratio  reinforcement  are  very  different,
because  the  two  schedules  lead  to  different  relations  between  reinforcement
and  the  fine  "grain"  of  the  record.  When  reinforcement  is  arranged
by  a  clock,  the  clock  runs  whether  or  not  the  pigeon  is  responding.  The 
probability  of  reinforcement  therefore  increases  during  any  pause.  A 
response  following  a  pause  is  especially  likely  to  be  reinforced.  Under 
variable-ratio  reinforcement,  however,  a  pause  does  not  alter  the  chances 
of  reinforcement.  There  is  no  special  likelihood  that  the  first  response 
made  after  a  pause  will  be  reinforced.  On  the  contrary  it  is  likely 
that  a  response  occurring  during  a  short  burst  will  be  reinforced,  es¬ 
pecially  because  a  short  burst  is  likely  to  be  executed  in  its  entirety 
before  the  reinforcement  achieved  by  any  one  of  its  members  is  actually 
received. 
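
The contrast between the two schedules can be put numerically. The following is a sketch under simplifying assumptions that are ours and not part of the experiments: intervals under the clock are taken to be exponentially distributed (constant probability per unit time), no reinforcement is waiting when the pause begins, and each response under the ratio schedule is reinforced with a constant probability. The mean values in the usage note are hypothetical.

```python
import math


def p_reinforced_after_pause_vi(pause_s, mean_interval_s):
    """Chance that the clock has set up a reinforcement during a pause.

    Assumes exponentially distributed intervals with the given mean.
    """
    return 1.0 - math.exp(-pause_s / mean_interval_s)


def p_reinforced_after_pause_vr(pause_s, mean_ratio):
    """Chance that the next response is reinforced under a variable ratio.

    The pause length is accepted but ignored: only responses advance
    the program, so waiting changes nothing.
    """
    return 1.0 / mean_ratio
```

With a hypothetical mean interval of 180 seconds, `p_reinforced_after_pause_vi` rises steadily as the pause lengthens from 1 to 10 to 60 seconds, whereas `p_reinforced_after_pause_vr` with a mean ratio of 110 gives the same value for every pause.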

We  confirm  this  explanation  by  deliberately  controlling  the  "fine 
grain  effect."  We  arrange  that  a  response  will  be  reinforced  only  if  it 
has  been  immediately  preceded  by  a  given  number  of  responses  during  a 
given  period  of  time.  By  insisting  upon  rapid  responding  in  this  way  the 
rate  under  variable-interval  reinforcement  can  be  made  to  reach  or  even
exceed  the  rate  observed  under  variable-ratio  reinforcement.  Contrariwise,
we  can  introduce  into  our  apparatus  a  device  which  insures  that  no
response  will  be  reinforced  if  it  has  been  preceded  by  another  response
during  a  given  interval  of  time.  We  insist  upon  slow  responding.  The
effect  of  this  upon  variable-interval  responding  is  clear  cut.

Here  the  steeper  curve  shows  a  typical  performance  under  variable-interval
reinforcement.  The  other  curve  shows  the  performance  of  the  same
pigeon  when  a  device  has  been  introduced  which  prevents  the  reinforcement 
of  a  response  if  it  has  been  preceded  by  another  response  within  6  seconds.
The  overall  rate  of  responding  is  simply  reduced.  The  final  slope  depends 
upon  the  specified  pause  which  is  to  precede  the  reinforced  response. 
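
The action of such a device can be sketched as a simple test applied to each response. The class name is ours, and treating the very first response of a session as eligible is an assumption. In use, a response would be reinforced only if this test passes and the variable-interval program has also set up a reinforcement.

```python
class SlowRespondingFilter:
    """Pass a response only if no other response closely preceded it."""

    def __init__(self, min_pause_s=6.0):
        self.min_pause_s = min_pause_s   # required pause before the response
        self.last_response_at = None     # time of the previous response

    def eligible(self, now_s):
        """Record a response at now_s; return True if it may be reinforced."""
        previous = self.last_response_at
        self.last_response_at = now_s
        # Assumption: the first response of a session counts as eligible.
        return previous is None or now_s - previous >= self.min_pause_s
```

Every response, eligible or not, restarts the required pause, which is why the overall rate is simply reduced.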

Comparable  results  have  been  obtained  for  the  other  conditions  specified 
above.  Since  the  significance  of  these  results,  however,  can  be  seen  only  when 
a  unitary  formulation  of  all  schedules  is  possible,  a  report  of  details  at  this 
stage  would  be  relatively  meaningless  and  of  no  particular  value.  As  already 
noted,  the  undersigned  have  in  progress  a  full  length  book-manuscript  to  be 
published  by  the  Macmillan  Company  which  will  bring  together  all  these  details 
under  a  unified  theory  of  the  effect  of  intermittent  reinforcement. 


B.  F.  Skinner,  Director 


C.  B.  Ferster,  Research  Fellow 